1 Sentinel scheduling: a model for compiler-controlled speculative execution

Scott A. Mahlke, William Y. Chen, Roger A. Bringmann, Richard E. Hank, Wen-Mei W. Hwu, B. 
Ramakrishna Rau, Michael S. Schlansker 

November 1993 ACM Transactions on Computer Systems (TOCS), volume 11 Issue 4 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Speculative execution is an important source of parallelism for VLIW and superscalar 
processors. A serious challenge with compiler-controlled speculative execution is to 
efficiently handle exceptions for speculative instructions. In this article, a set of architectural 
features and compile-time scheduling support collectively referred to as sentinel scheduling 
is introduced. Sentinel scheduling provides an effective framework for both compiler- 
controlled speculative executi ... 

Keywords: VLIW processor, exception detection, exception recovery, instruction scheduling, 
instruction-level parallelism, speculative execution, superscalar processor 



2 Unconstrained speculative execution with predicated state buffering

Hideki Ando, Chikako Nakanishi, Tetsuya Hara, Masao Nakaya 

May 1995 ACM SIGARCH Computer Architecture News, Proceedings of the 22nd annual
international symposium on Computer architecture, volume 23 issue 2

Full text available: pdf (1.50 MB) Additional Information: full citation, abstract, references, citings, index terms

Speculative execution is execution of instructions before it is known whether these 
instructions should be executed. Compiler-based speculative execution has the potential to 
achieve both a high instruction per cycle rate and high clock rate. Pure compiler-based 
approaches, however, have greatly limited instruction scheduling due to a limited ability to 
handle side effects of speculative execution. Significant performance improvement is, thus, 
difficult in non-numerical applications. This paper ... 

3 An out-of-order execution technique for runtime binary translators
Bich C. Le 

October 1998 Proceedings of the eighth international conference on Architectural
support for programming languages and operating systems, volume 32, 33 issue 5, 11

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

A dynamic translator emulates an instruction set architecture by translating source 
instructions to native code during execution. On statically-scheduled hardware, higher 
performance can potentially be achieved by reordering the translated instructions; however, 
this is a challenging transformation if the source architecture supports precise exception 
semantics, and the user-level program is allowed to register exception handlers. This paper 
presents a software technique which allows a translato ... 

4 Sentinel scheduling for VLIW and superscalar processors

Scott A. Mahlke, William Y. Chen, Wen-mei W. Hwu, B. Ramakrishna Rau, Michael S. Schlansker

September 1992 ACM SIGPLAN Notices, Proceedings of the fifth international conference on
Architectural support for programming languages and operating systems, volume 27 issue 9

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Speculative execution is an important source of parallelism for VLIW and superscalar 
processors. A serious challenge with compiler-controlled speculative execution is to 
accurately detect and report all program execution errors at the time of occurrence. In this 
paper, a set of architectural features and compile-time scheduling support referred to as 
sentinel scheduling is introduced. Sentinel scheduling provides an effective framework for 
compiler-controlled speculative ex ... 

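5 Dynamic translation: The Transmeta Code Morphing Software: using speculation,
recovery, and adaptive retranslation to address real-life challenges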

James C. Dehnert, Brian K. Grant, John P. Banning, Richard Johnson, Thomas Kistler, Alexander 
Klaiber, Jim Mattson 

March 2003 Proceedings of the international symposium on Code generation and 
optimization: feedback-directed and runtime optimization 

Full text available: pdf (988.25 KB) Additional Information: full citation, abstract, references
Publisher Site

Transmeta's Crusoe microprocessor is a full, system-level implementation of the x86 
architecture, comprising a native VLIW microprocessor with a software layer, the Code 
Morphing Software (CMS), that combines an interpreter, dynamic binary translator, 
optimizer, and runtime system. In its general structure, CMS resembles other binary 
translation systems described in the literature, but it is unique in several respects. The wide 
range of PC workloads that CMS must handle gracefully in real ... 

Keywords: binary translation, dynamic optimization, dynamic translation, emulation, self- 
modifying code, speculation 



6 Integrated predicated and speculative execution in the IMPACT EPIC architecture
David I. August, Daniel A. Connors, Scott A. Mahlke, John W. Sias, Kevin M. Crozier, Ben-Chung 
Cheng, Patrick R. Eaton, Qudus B. Olaniran, Wen-mei W. Hwu 

April 1998 ACM SIGARCH Computer Architecture News, Proceedings of the 25th annual
international symposium on Computer architecture, volume 26 issue 3

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

Explicitly Parallel Instruction Computing (EPIC) architectures require the compiler to express 
program instruction level parallelism directly to the hardware. EPIC techniques which enable 
the compiler to represent control speculation, data dependence speculation, and predication 
have individually been shown to be very effective. However, these techniques have not been 
studied in combination with each other. This paper presents the IMPACT EPIC Architecture to 
address the issues involved in design ... 

June 2004 ACM SIGPLAN Notices, Proceedings of the 2004 ACM SIGPLAN/SIGBED conference on
Languages, compilers, and tools, volume 39 issue 7

Full text available: pdf (609.97 KB) Additional Information: full citation, abstract, references, index terms

This paper evaluates managing the processor's datapath-width at the compiler level by 
means of exploiting dynamic narrow-width operands. We capitalize on the large occurrence 
of these operands in multimedia programs to build static narrow-width regions that may be 
directly exposed to the compiler. We propose to augment the ISA with instructions directly 
exposing the datapath and the register widths to the compiler. Simple exception 
management allows this exposition to be only speculative. In thi ... 

Keywords: clock-gating, compiler, energy management, narrow-width regions, 
reconfigurable computing, speculative execution 



8 Superscalar design: Cherry: checkpointed early resource recycling in out-of-order
microprocessors

José F. Martínez, Jose Renau, Michael C. Huang, Milos Prvulovic, Josep Torrellas
November 2002 Proceedings of the 35th annual ACM/IEEE international symposium on 
Microarchitecture 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms, review
Publisher Site

This paper presents CHeckpointed Early Resource Recycling (Cherry), a hybrid mode of 
execution based on ROB and checkpointing that decouples resource recycling and instruction 
retirement. Resources are recycled early, resulting in a more efficient utilization. Cherry 
relies on state checkpointing and rollback to service exceptions for instructions whose 
resources have been recycled. Cherry leverages the ROB to (1) not require in-order 
execution as a fallback mechanism, (2) allow memory re ... 

Dan Ernst, Todd Austin 

May 2002 ACM SIGARCH Computer Architecture News, volume 30 issue 2 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms
Publisher Site

An increasingly large portion of scheduler latency is derived from the monolithic content 
addressable memory (CAM) arrays accessed during instruction wakeup. The performance of 
the scheduler can be improved by decreasing the number of tag comparisons necessary to 
schedule instructions. Using detailed simulation-based analyses, we find that most 
instructions enter the window with at least one of their input operands already available. By 
putting these instructions into specialized windows with fe ... 

Keywords: dynamic scheduling, complexity-effective architecture, low-power architecture, 
last-tag prediction 